Anand Kumar, M.
- Innovative Feature Sets for Machine Learning based Telugu Character Recognition
Authors
Affiliations
1 Centre for Excellence in Computational Engineering and Networking, Amrita Vishwa Vidyapeetham, Coimbatore - 641 112, Tamil Nadu, IN
Source
Indian Journal of Science and Technology, Vol 8, No 24 (2015)
Abstract
In this information age, sources of information such as historic documents, books, and manuscripts are digitized and made available worldwide over the internet as scanned copies. These scanned images, available in colour or black and white for pleasant viewing, contain valuable information. Optical Character Recognition (OCR) technology makes it possible to search for keywords in these digital copies. This paper presents a new method for building an OCR system for the Telugu script, focusing mainly on the character recognition module. Features extracted through Discrete Wavelet Transform (DWT), Projection Profile (PP) and Singular Value Decomposition (SVD) are evaluated using k-Nearest Neighbour (k-NN) and Support Vector Machine (SVM) classifiers. The best results are obtained from the DWT features with the SVM classifier.
Keywords
Discrete Wavelet Transform, K-nearest Neighbour, Optical Character Recognition, Singular Value Decomposition, Support Vector Machine, Telugu Character Recognition.
- Randomized Kernel Approach for Named Entity Recognition in Tamil
Authors
Affiliations
1 Centre for Excellence in Computational Engineering and Networking, Amrita Vishwa Vidyapeetham, Coimbatore - 641 112, Tamil Nadu, IN
Source
Indian Journal of Science and Technology, Vol 8, No 24 (2015)
Abstract
In this paper, we present a new approach for Named Entity Recognition (NER) in the Tamil language using the Random Kitchen Sink (RKS) algorithm. Named Entity Recognition is the process of identifying Named Entities (NEs) in text and classifying them into predefined categories such as person, location, and organization. Much work has been done on Named Entity Recognition for English and Indian languages using various machine learning approaches. In this work, we implement an NER system for Tamil using the Random Kitchen Sink algorithm, which is a statistical and supervised approach. The NER system is also implemented using Support Vector Machine (SVM) and Conditional Random Field (CRF). The overall performance of the NER system was 86.61% for RKS, 81.62% for SVM and 87.21% for CRF. Additional results were obtained for SVM and CRF by increasing the corpus size, giving 86.06% and 87.20% respectively.
Keywords
Conditional Random Field (CRF), Named Entities (NEs), Named Entity Recognition (NER), Natural Language Processing (NLP), Random Kitchen Sink (RKS), Support Vector Machine (SVM)
- Predicting the Sentimental Reviews in Tamil Movie using Machine Learning Algorithms
Authors
Affiliations
1 Centre for Computational Engineering and Networking (CEN), Amrita School of Engineering, Coimbatore, Amrita Vishwa Vidyapeetham, Amrita University, Coimbatore – 641112, Tamil Nadu, IN
Source
Indian Journal of Science and Technology, Vol 9, No 45 (2016)
Abstract
Objective: This paper aims at classifying Tamil movie reviews as positive or negative using supervised machine learning algorithms. Methods/Analysis: Novel machine learning approaches are needed for analyzing social media text, where the volume of data is increasing exponentially. In this work, machine learning algorithms such as SVM, Maxent classifier, Decision tree and Naive Bayes are used to classify Tamil movie reviews as positive or negative. Features are also extracted from TamilSentiwordnet. Findings: A dataset was prepared for this work. SVM performs well in classifying Tamil movie reviews compared with the other machine learning algorithms, as shown by both cross-validation and accuracy. After SVM, the Decision tree also performs well in classifying the Tamil reviews. Novelty/Improvement: SVM gives an accuracy of 75.9% for classifying Tamil movie reviews, which is a good milestone for research on the Tamil language.
Keywords
Machine Learning, Maxent Classifier, Sentimental Analysis, Support Vector Machine, Tamil Language, TamilSentiwordnet.
- Cuisine Prediction based on Ingredients using Tree Boosting Algorithms
Authors
Affiliations
1 Centre for Computational Engineering and Networking (CEN), Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Amrita University, Coimbatore - 641112, Tamil Nadu, IN
Source
Indian Journal of Science and Technology, Vol 9, No 45 (2016)
Abstract
Objective: This paper aims at predicting the cuisine based on the ingredients using tree boosting algorithms. Methods/Analysis: Text mining is an important tool for data mining in e-commerce websites. E-commerce business is growing at a significant rate in both the Business-to-Business (B2B) and Business-to-Customer (B2C) categories. Machine learning based models and prediction methods are applied to real-world e-commerce data to increase revenue and study customer behavior. Many online cooking and recipe sharing websites have spurred the evolution of recipe recommendation systems. In this paper, we describe a scalable end-to-end tree boosting system to predict cuisine based on the ingredients, explore different data analyses, and discuss the dataset types and their performance. Novelty/Improvement: An accuracy of about 80% is obtained for cuisine prediction using the XGBoost algorithm.
Keywords
Data Analysis, Prediction, Random Forest, Text Analytics, XGBoost.
- A Fast and Efficient Framework for Creating Parallel Corpus
Authors
Affiliations
1 Centre for Computational Engineering and Networking (CEN), Amrita School of Engineering, Amrita University, Amrita Vishwa Vidyapeetham, Amritanagar, Coimbatore – 641 112, Tamil Nadu, IN
Source
Indian Journal of Science and Technology, Vol 9, No 45 (2016)
Abstract
Objectives: To present a framework that uses the Scansnap SV600 scanner and Google Optical Character Recognition (OCR) to create a parallel corpus, an essential component of Statistical Machine Translation (SMT). Methods and Analysis: Training a language model for an SMT system depends heavily on the availability of a parallel corpus. An effective approach for collecting parallel sentences is a key step in building an MT system. However, creating a parallel corpus requires extensive knowledge of both languages and is a time-consuming process. These limitations make digitizing documents very difficult, which in turn affects the quality of machine translation systems. In this paper, we propose a faster and more efficient way of generating English to Indian language parallel corpora with less human involvement. With the help of a special type of scanner, the Scansnap SV600, together with Google OCR and a little linguistic knowledge, a parallel corpus can be created for any language pair, provided that paper documents with parallel sentences are available. Findings: It was possible to generate 40 parallel sentences in 1 hour with this approach. Sophisticated morphological tools were used to change the morphology of the generated text and thereby increase the size of the corpus. An additional benefit is that ancient scriptures and other manuscripts can be made available in digital format, which coming generations can then refer to in order to keep up the traditions of a nation or a society. Novelty: The time required for creating a parallel corpus is reduced by incorporating Google OCR and a book scanner.
Keywords
Google OCR, Machine Translation, Parallel Corpus, Statistical Machine Translation, Scansnap SV600 Scanner.
- Word Embedding Models for Finding Semantic Relationship between Words in Tamil Language
Authors
Affiliations
1 Centre for Computational Engineering and Networking (CEN), Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Amrita University, Coimbatore - 641112, Tamil Nadu, IN
Source
Indian Journal of Science and Technology, Vol 9, No 45 (2016)
Abstract
Objective: Word embedding models are predominantly used in many NLP tasks such as document classification, author identification, and story understanding. In this paper we compare two word embedding models for semantic similarity in the Tamil language. Each of the two models has its own way of predicting relationships between words in a corpus. Method/Analysis: In Natural Language Processing, word embedding is the representation of words as vectors. Word embedding is used as an unsupervised approach in place of traditional feature extraction. Word embedding models use neural networks to generate numerical representations of the given words. A morphologically rich language like Tamil is well suited to finding the model that best captures semantic relationships between words. Tamil is one of the oldest Dravidian languages and is known for its morphological richness. In Tamil it is possible to construct 10,000 words from a single root word. Findings: We compare Content based and Context based Word embedding models. We tried different feature vector sizes for the same word to assess the accuracy of the models for semantic similarity. Novelty/Improvement: Analysing word embedding models for a morphologically rich language like Tamil helps to classify words better based on their semantics.
Keywords
CBOW, Content based Word Embedding, Context based Word Embedding, Morphology, Semantic and Syntactic, Skip Gram.
- Knowledge based Approach for English-Malayalam Parallel Corpus Generation
Authors
Affiliations
1 Centre for Computational Engineering and Networking (CEN), Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Amrita University, Coimbatore - 641112, Tamil Nadu, IN
Source
Indian Journal of Science and Technology, Vol 9, No 45 (2016)
Abstract
Objective: This paper gives an overview of one part of Natural Language Generation: parallel sentence generation, which involves generating an English sentence together with its Malayalam translation. Methods/Analysis: A template based sentence generation approach is followed. The proposed system takes input from a manually created bilingual dictionary and fills the slots in the templates to generate parallel sentences. Finding: Using the proposed method, we generated a total of 25,208 parallel sentences. These can be used in a bilingual Machine Translation dictionary. Application/Improvement: The present work uses only four templates, but increasing the number of templates and updating the dictionary would increase the size of the parallel corpus that can be generated.
Keywords
Bilingual, English-Malayalam, Machine Translation, Parallel sentence, Templates.
- Green based Software Development Life Cycle Model for Software Engineering
Authors
Affiliations
1 Department of Computer Science, Karpagam University, Coimbatore - 641021, Tamil Nadu, IN
2 Department of Information Technology, Karpagam University, Coimbatore - 641021, Tamil Nadu, IN
Source
Indian Journal of Science and Technology, Vol 9, No 32 (2016)
Abstract
Objectives: The main objective of this research work is to propose a new green based model for software engineering with no impact on the environment. Analysis: In the early stages of ICT, manufacturing companies and software development companies gave more importance to hardware and software. They did not consider the sustainability of the resources used to manufacture a product or develop an application. Statistical analysis identified a need to minimize the energy utilization and related CO2 emissions of IT equipment. Result: The experimental results verified that the proposed model outperformed the existing software engineering model, with little impact on the environment. Improvement/Application: Experiments revealed that the proposed Green Based SDLC technique is able to reduce power consumption.
Keywords
Capability Maturity Model, Environment and Development, Green Power Indicator, Green Tracker, Non Function Requirements, Sustainable Software Development Life Cycle.
- Green Database Design Model in Software Development Life Cycle Phase
Authors
Affiliations
1 Department of Computer Science, Karpagam University, Coimbatore – 641021, Tamil Nadu, IN
2 Department of Information Technology, Karpagam University, Coimbatore – 641021, Tamil Nadu, IN
Source
Indian Journal of Science and Technology, Vol 9, No 30 (2016)
Abstract
Background/Objectives: This paper introduces a Green Software Database Design Model and aims to raise awareness of energy consumption issues in database design. Methods/Statistical Analysis: The main aspect of the work is to consider energy consumption during the design phase of the database. Findings: The approach designs the database using green computing techniques to reduce the power utilization of the server under different workload conditions. The energy consumption database design is then used to estimate the impact of software applications based on their resource utilization. Improvement/Application: The work is validated on both the desktop side and the server side. The experiment demonstrates the effectiveness of the database design, which provides relevant information about the energy utilization of software application design on the database in software engineering.
Keywords
Database Design, Energy Consumption, Green Software Engineering, Software Application, Software Engineering
- Survey on Identifying Packet Misbehavior in Network Virtualization
Authors
S. Reshmi 1, M. Anand Kumar 2
Affiliations
1 Karpagam University, Coimbatore - 641021, Tamil Nadu, IN
2 Department of Information Technology, Karpagam University, Coimbatore - 641021, Tamil Nadu, IN
Source
Indian Journal of Science and Technology, Vol 9, No 31 (2016)
Abstract
Background/Objectives: Network virtualization offers users effective, precise, and secure sharing of networking resources. Methods/Statistical Analysis: Networks have an accountability problem: a malicious router can drop packets that it is supposed to forward. To examine packet dropping in detail, this paper identifies the main attacks and the algorithms introduced to tackle them. Findings: This article gives a concise assessment of two major attacks: the black hole attack and the gray hole attack. If there is a malicious node in the network, data packets do not reach the destination because they are dropped midway along the path. To overcome these issues, we identify mechanisms proposed against the attacks that improve network performance in terms of packet drop rate. Applications/Improvements: The heuristics algorithm and the obfuscation algorithm help in locating packets lost in the network during transmission to end users.
Keywords
Black Hole Attack, Gray Hole Attack, Heuristics Algorithm, Network Virtualization, Obfuscation Algorithm, Virtual Routing and Forwarding.
- Simulated and Self-Sustained Classification of Twitter Data based on its Sentiment
Authors
Affiliations
1 Centre for Excellence in Computational Engineering and Networking, Amrita Vishwa Vidyapeetham, Coimbatore - 641112, Tamil Nadu, IN